Abstract. Modern approaches to the search for relative and global minima of
the potential energy function of biomacromolecular structures include
techniques of combinatorial optimization, such as the study of Steiner Points
and Steiner Trees. These methods have been successfully applied to modelling
configurations in which the average atomic positions are arranged as sequences
of evenly spaced points along right circular helices. In the present
contribution, we show how these methods can be adapted to motivate the concept
of a Steiner Ratio Function (SRF). We also show that this new concept fits the
results obtained from computing experiments, and that it improves on these
results when we restrict the work to Full Steiner Trees.

1 Introduction 

The study of Steiner Trees has been shown to be useful for understanding the
structure of biomacromolecules through the application of combinatorial
optimization techniques. These methods have usually been applied to modelling
configurations of evenly spaced points along right circular helices, since
these are adequate to fit the average atomic positions. Some authors have
failed in their efforts to correlate the Steiner Ratio with the potential
energy of a given special configuration, and so did not succeed in developing
a robust method of Global Optimization. In the present work we propose to
extend the concept of a Steiner Ratio into that of a SRF [6]. This was possible
by working with experimental data obtained from a modification of W. D. Smith's
algorithm [7], which was adapted to work with only one topology. In our
experiments, we have followed the prescription of using the topology of the
sausage configuration [8]. The study of the candidates for a SRF is part of a
full geometric approach to the problem of macromolecular structure. It has deep
connections with the problem of a good definition of a chirality measure [9],
and it provides insights into the homochirality phenomenon and its importance
for the stability of macromolecular configurations. The chirality of
biomacromolecules seems to be of such fundamental importance that its study
deserves more powerful methods than those usually considered in Combinatorial
Optimization. In our first approach, we proposed a chirality function as a
constraint in a thermodynamically inspired construction of a new Cost Function
as a Gibbs Free Energy. In a more elaborate theory, the potential energy of the
configuration and its chirality should both come from a self-contained Steiner
Function, with no need for a constraint.

At the present stage of our research, we have to pave the way for this future
theory by undertaking the study of possible candidates for a SRF. We have also
taken into consideration the restrictions imposed by the natural requirement
of Full Steiner Trees.



2 The Steiner Ratio 

Let us consider a finite set of points A in a metric manifold M. We consider
all the possible ways (s-topologies) of connecting pairs of points of each such
set of this manifold. The resulting edges are supposed to be geodesics of the
manifold and their collection is a tree. We get a spanning tree (SP) by
discarding the edge of greatest length. Among these spanning trees of the set A
with different s-topologies and lengths $l_{SP}(s, A)$, there is one whose
overall length is a minimum as compared to all the trees of the same set. This
is the minimum spanning tree of the set A, MST(A), and its length is

$$ l_{MST}(A) = \min_{(s\text{-topologies})} l_{SP}(s, A) . \qquad (2.1) $$

If we now allow for the introduction of additional points on each set A of the
manifold in order to have spanning trees of smaller overall length, we arrive
at the concept of a Steiner tree (ST). In the construction of these trees we
have to follow the additional requirement that the tangent lines to the
geodesic edges meet at 120° at each Steiner (additional) point (t-topologies).
Among these Steiner trees of the set A with different t-topologies and lengths
$l_{ST}(t, A)$, there is one whose overall length is a minimum. We call it the
Steiner Minimal Tree of the set A, SMT(A). Its length is given by

$$ l_{SMT}(A) = \min_{(t\text{-topologies})} l_{ST}(t, A) . \qquad (2.2) $$

The MST(A) is considered as the worst approximation (the "worst cut") to the
SMT(A). It is usual to associate a number with this pair of minimal trees of
the set A: the ratio of their overall lengths, calculated with the definition
of distance of the manifold M. This number is called the Steiner Ratio of the
set $A \subset M$. We write

$$ \rho(A) = \frac{l_{SMT}(A)}{l_{MST}(A)} . \qquad (2.3) $$
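The ratio (2.3) can be illustrated on the smallest non-trivial case. The sketch below is only an illustration in the Euclidean plane, not part of the method of this paper: it computes the MST length with Prim's algorithm and, for three points whose triangle has all angles below 120°, the SMT length from the classical Fermat-Torricelli identity. The function names are hypothetical.

```python
import math

def mst_length(points):
    """Length of the Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n          # cheapest known connection of each point to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # pick the closest point that is not yet in the tree
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                best[v] = min(best[v], math.dist(points[u], points[v]))
    return total

def smt_length_triangle(p, q, r):
    """SMT length of three points, assuming all triangle angles are below 120 degrees.
    Uses L^2 = (a^2 + b^2 + c^2)/2 + 2*sqrt(3)*Area (Fermat-Torricelli point)."""
    a, b, c = math.dist(q, r), math.dist(p, r), math.dist(p, q)
    s = (a + b + c) / 2.0
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))   # Heron's formula
    return math.sqrt((a * a + b * b + c * c) / 2.0 + 2.0 * math.sqrt(3.0) * area)

# Unit equilateral triangle: MST length 2, SMT length sqrt(3).
triangle = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3.0) / 2.0)]
rho = smt_length_triangle(*triangle) / mst_length(triangle)
print(rho)
```

For the unit equilateral triangle the ratio is $\sqrt{3}/2$, which is smaller than 1 because the single Steiner point shortens the tree.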



The Steiner Ratio of the manifold, $\rho_M$, is then defined to be the infimum
of the values $\rho(A)$, or

$$ \rho_M = \inf_{A \subset M} \rho(A) . \qquad (2.4) $$

3 The Steiner Ratio Function

The concept of a Steiner Ratio Function is a specialization of the definitions
above. It is best introduced by examples. We now suppose that all the sets A of
the manifold M have the same number n of points. Moreover, we take these points
$P_i$ to be evenly spaced along a right circular helix. The Cartesian
coordinates of the sequence of consecutive points are given by

$$ P_i = (\cos i\omega, \ \sin i\omega, \ \alpha i \omega), \quad 0 \le i \le n-1 , \qquad (3.1) $$

where $2\pi\alpha$ is the pitch of the helix.

Equation (3.1) means that we have a helical point set for each pair of values
$(\omega, \alpha)$. We now consider the subsequences of points obtained from
(3.1) by skipping points. They are of the form $P_{j+kl}$, with
$j = 0, 1, 2, \ldots, k-1$; $(k-1)$ = number of skipped points; $l$ = number of
intervals of skipped points before the present point. There are n possible
sequences. Among these only k of them contain different points. The process of
construction of these subsequences is the following:

Given n points evenly spaced along a right circular helix, we form a
j-subsequence with $(k-1)$ skipped points and a maximum l-value, $l_{\max}$.
Since the index $(j + lk)$ is restricted by

$$ j + lk \le n - 1 , \qquad (3.2) $$

we have

$$ l_{\max} = \left[ \frac{n - j - 1}{k} \right] , \qquad (3.3) $$

where [.] stands for the greatest integer value. A j-subsequence is then
given by

$$ (P_j)_{k, l_{\max}} : \ P_j, P_{j+k}, P_{j+2k}, \ldots, P_{j+l_{\max}k} . \qquad (3.4) $$

We also define a sequence which is the union of all the j-subsequences by
defining connection edges between consecutive points at the highest end and/or
lowest end.

Some examples illustrate the results above.
1. Let n = 23, k = 2, j = 0, 1.
We have

$$ l_{\max}(j = 0) = 11, \quad l_{\max}(j = 1) = 10 $$

and

$$ P_{j + l_{\max}k} = P_{22}, \ P_{21}, \ \text{respectively.} $$

The subsequences are

$$ (P_0)_{2,11} : \ P_0, P_2, P_4, P_6, P_8, P_{10}, P_{12}, P_{14}, P_{16}, P_{18}, P_{20}, P_{22} \qquad (3.6) $$
$$ (P_1)_{2,10} : \ P_1, P_3, P_5, P_7, P_9, P_{11}, P_{13}, P_{15}, P_{17}, P_{19}, P_{21} . \qquad (3.7) $$

2. Let n = 23, k = 3, j = 0, 1, 2.
We have

$$ l_{\max}(j = 0) = 7, \quad l_{\max}(j = 1) = 7, \quad l_{\max}(j = 2) = 6 $$

and

$$ P_{j + l_{\max}k} = P_{21}, \ P_{22}, \ P_{20} . $$

The subsequences are now

$$ (P_0)_{3,7} : \ P_0, P_3, P_6, P_9, P_{12}, P_{15}, P_{18}, P_{21} \qquad (3.8) $$
$$ (P_1)_{3,7} : \ P_1, P_4, P_7, P_{10}, P_{13}, P_{16}, P_{19}, P_{22} \qquad (3.9) $$
$$ (P_2)_{3,6} : \ P_2, P_5, P_8, P_{11}, P_{14}, P_{17}, P_{20} . \qquad (3.10) $$
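The construction of the j-subsequences can be sketched in a few lines. This is a minimal illustration of (3.2) to (3.4) working on point indices only; the function names are hypothetical.

```python
def l_max(n, k, j):
    """Largest l with j + l*k <= n - 1, as in (3.3)."""
    return (n - j - 1) // k

def j_subsequence(n, k, j):
    """Indices of the points P_j, P_{j+k}, ..., P_{j + l_max*k} of (3.4)."""
    return [j + l * k for l in range(l_max(n, k, j) + 1)]

# The two examples of the text: n = 23 with k = 2 and k = 3.
for k in (2, 3):
    for j in range(k):
        print(k, j, l_max(23, k, j), j_subsequence(23, k, j))
```

The k subsequences are disjoint and their union recovers all n indices, which is why only k of them contain different points.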



The sequence corresponding to (3.1) can be recovered by making j = 0, k = 1
in the process described above; it can also be written as $(P_0)_{1,(n-1)}$.

To each sequence of the form (3.5) corresponds a spanning tree
SP$(k, \omega, \alpha)$. In Fig. 1, we show the sequence of points of (3.1) and
of the two examples above, respectively.




Fig. 1. (A) The sequence n = 23, k = 1, j = 0; (B) The union of the sequences
n = 23, k = 2, j = 0 and n = 23, k = 2, j = 1; (C) The union of the sequences
n = 23, k = 3, j = 0; n = 23, k = 3, j = 1 and n = 23, k = 3, j = 2.



In the following, we take M to be the $\mathbb{R}^3$ manifold with a Euclidean
definition of distance, or $E^3$.

The coordinates of the points $P_{j+kl}$ are given analogously to (3.1). The
Euclidean length of n-point configurations like those of Fig. 1 above can be
written



$$ l_{SP}(k, \omega, \alpha) = (n-k)\sqrt{k^2\alpha^2\omega^2 + A_k + 1} + (k-1)\sqrt{\alpha^2\omega^2 + A_1 + 1} \qquad (3.11) $$

where

$$ A_k = 1 - 2\cos(k\omega) . \qquad (3.12) $$
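The closed form (3.11) can be checked against a direct sum of edge lengths. The sketch below assumes a unit-radius helix as in (3.1) and, as an assumption made only for this illustration, places the (k - 1) connection edges between the consecutive points at the lowest end; all edges between consecutive points have the same length, so the choice of end does not change the total.

```python
import math

def helix_point(i, omega, alpha):
    """Point P_i of (3.1) on a unit-radius helix."""
    return (math.cos(i * omega), math.sin(i * omega), alpha * i * omega)

def A(k, omega):
    """A_k of (3.12)."""
    return 1.0 - 2.0 * math.cos(k * omega)

def l_sp_formula(n, k, omega, alpha):
    """Spanning tree length according to (3.11)."""
    long_edge = math.sqrt((k * alpha * omega) ** 2 + A(k, omega) + 1.0)
    short_edge = math.sqrt((alpha * omega) ** 2 + A(1, omega) + 1.0)
    return (n - k) * long_edge + (k - 1) * short_edge

def l_sp_direct(n, k, omega, alpha):
    """Same length, summed edge by edge from the coordinates."""
    pts = [helix_point(i, omega, alpha) for i in range(n)]
    edges = [(i, i + k) for i in range(n - k)]     # edges inside the j-subsequences
    edges += [(j, j + 1) for j in range(k - 1)]    # assumed connection edges (lowest end)
    return sum(math.dist(pts[a], pts[b]) for a, b in edges)

for k in (1, 2, 3):
    print(k, l_sp_formula(23, k, 2.0, 0.3), l_sp_direct(23, k, 2.0, 0.3))
```

The values $\omega = 2.0$ and $\alpha = 0.3$ are arbitrary test values, not taken from the experiments.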

There is only one Steiner Tree for all these point configurations. It has the
3-sausage's topology and its Euclidean length is written as

$$ l_{ST}(\omega, \alpha) = (n-2)(1-r) + (n-3)\,\alpha\omega\sqrt{\frac{A_1+1}{A_1}} + 2\sqrt{\alpha^2\omega^2 + (1-r)^2 + r(A_1+1)} \qquad (3.13) $$

where

$$ r(\omega, \alpha) = \frac{\alpha\omega}{\sqrt{A_1(A_1+1)}} . \qquad (3.14) $$

The Steiner points also lie on a helix of the same pitch but of smaller radius
$r(\omega, \alpha)$ as compared to the points of the configuration (3.1).
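The role of the radius (3.14) can be made concrete with a numerical check. The sketch below assumes that the Steiner point $S_i$ sits on the inner helix at the same angle and height as $P_i$ (for $1 \le i \le n-2$), which is the arrangement consistent with the term $(n-2)(1-r)$ in (3.13). Under that assumption it verifies that the three edges at an interior Steiner point meet at 120° and that (3.13) equals the direct sum of edge lengths. Function names and test values are hypothetical.

```python
import math

def A1(omega):
    return 1.0 - 2.0 * math.cos(omega)

def steiner_radius(omega, alpha):
    """Radius r of (3.14)."""
    a1 = A1(omega)
    return alpha * omega / math.sqrt(a1 * (a1 + 1.0))

def helix_point(i, omega, alpha, radius=1.0):
    return (radius * math.cos(i * omega), radius * math.sin(i * omega), alpha * i * omega)

def l_st_formula(n, omega, alpha):
    """Steiner tree length according to (3.13)."""
    a1, r, x = A1(omega), steiner_radius(omega, alpha), alpha * omega
    return ((n - 2) * (1.0 - r)
            + (n - 3) * x * math.sqrt((a1 + 1.0) / a1)
            + 2.0 * math.sqrt(x * x + (1.0 - r) ** 2 + r * (a1 + 1.0)))

def l_st_direct(n, omega, alpha):
    """Same length, summed edge by edge (assumed 3-sausage edge list)."""
    r = steiner_radius(omega, alpha)
    P = [helix_point(i, omega, alpha) for i in range(n)]
    S = {i: helix_point(i, omega, alpha, r) for i in range(1, n - 1)}
    total = sum(math.dist(P[i], S[i]) for i in range(1, n - 1))      # n-2 radial edges
    total += sum(math.dist(S[i], S[i + 1]) for i in range(1, n - 2))  # n-3 inner edges
    total += math.dist(P[0], S[1]) + math.dist(P[n - 1], S[n - 2])    # two end edges
    return total

def angles_at_steiner_point(m, omega, alpha):
    """Angles (degrees) between the three edges meeting at an interior S_m."""
    r = steiner_radius(omega, alpha)
    s = helix_point(m, omega, alpha, r)
    nbrs = [helix_point(m, omega, alpha),
            helix_point(m - 1, omega, alpha, r),
            helix_point(m + 1, omega, alpha, r)]
    vecs = [tuple(b - a for a, b in zip(s, q)) for q in nbrs]
    def angle(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return math.degrees(math.acos(dot / (math.hypot(*u) * math.hypot(*v))))
    return [angle(vecs[0], vecs[1]), angle(vecs[0], vecs[2]), angle(vecs[1], vecs[2])]

print(angles_at_steiner_point(5, 2.0, 0.3))
print(l_st_formula(23, 2.0, 0.3), l_st_direct(23, 2.0, 0.3))
```

The check applies to interior Steiner points only; the two end points of the tree are not covered by this angle test.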



In order to adapt the formulae (2.1) and (2.2) to the present case of a SRF
definition we should note that if we take $n \gg k$, all the sets
$A \subset E^3$ should be considered as the same: a set of a great number of
points evenly spaced along a right circular helix.

For $n \gg k$, we then have, from (2.2), (3.13) and (3.14),

$$ l_{SMT}(\omega, \alpha) = n\left(1 + \alpha\omega\sqrt{\frac{A_1}{A_1+1}}\right) \qquad (3.15) $$

and from (2.1), (3.11) and (3.12),

$$ l_{MST}(k, \omega, \alpha) = n\sqrt{k^2\alpha^2\omega^2 + A_k + 1} . \qquad (3.16) $$
The Steiner Ratio Function is now defined, according to (2.3), (3.15) and
(3.16), as

$$ \rho(\omega, \alpha) = \frac{1 + \alpha\omega\sqrt{\dfrac{A_1}{A_1+1}}}{\min\limits_{k}\sqrt{k^2\alpha^2\omega^2 + A_k + 1}} . \qquad (3.17) $$

The "$\min_k$" in the equation above should be understood in the sense of a
new function formed in a piecewise way from the functions corresponding to the
chosen k-values. For k = 1, 2, 3 this means

$$ \min\left( \sqrt{\alpha^2\omega^2 + A_1 + 1}, \ \sqrt{4\alpha^2\omega^2 + A_2 + 1}, \ \sqrt{9\alpha^2\omega^2 + A_3 + 1} \right) . $$

The associated Steiner Ratio for these helical point configurations will be
given by the Global minimum of the function $\rho(\omega, \alpha)$ above,
according to (2.4).
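The piecewise character of the denominator of (3.17) can be seen by evaluating the function and recording which k attains the minimum. The sketch below is an illustration under the assumption $A_1 > 0$ (that is, $\cos\omega < 1/2$), so that the square roots are real; the function names and the sample values of $\omega$ are hypothetical.

```python
import math

def A(k, omega):
    return 1.0 - 2.0 * math.cos(k * omega)

def smt_per_point(omega, alpha):
    """l_SMT / n from (3.15); requires A_1 > 0."""
    a1 = A(1, omega)
    return 1.0 + alpha * omega * math.sqrt(a1 / (a1 + 1.0))

def mst_per_point(k, omega, alpha):
    """l_MST / n from (3.16)."""
    return math.sqrt((k * alpha * omega) ** 2 + A(k, omega) + 1.0)

def active_k(omega, alpha, ks=(1, 2, 3)):
    """The k-value whose branch realises the min in (3.17)."""
    return min(ks, key=lambda k: mst_per_point(k, omega, alpha))

def srf(omega, alpha, ks=(1, 2, 3)):
    """Steiner Ratio Function (3.17) restricted to the given k-values."""
    k = active_k(omega, alpha, ks)
    return smt_per_point(omega, alpha) / mst_per_point(k, omega, alpha)

alpha = 0.26454
for omega in (1.5, 2.0, 2.5, 3.0):
    print(omega, active_k(omega, alpha), srf(omega, alpha))
```

With these formulas the k = 2 branch is the active one near $\omega = \pi$, and the function exceeds 1 there.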



4 The Fitting of Computational Results of the W. D. 
Smith's Algorithm 

In Fig. 2 below, we represent a section of the SRF given by (3.17) for
$\alpha = \alpha_R = 0.26454000216\ldots$. This value corresponds to the pitch
of a helical configuration in which the points are vertices of regular
tetrahedra. These tetrahedra are glued together at common faces to form a
3-sausage configuration. The condition of equal edges for the tetrahedra leads
to the following equations:

$$ \alpha\omega = \frac{\sqrt{3}}{3}\sqrt{A_1 - A_2} , \qquad (4.1) $$

$$ \alpha\omega = \frac{\sqrt{5}}{5}\sqrt{A_2 - A_3} , \qquad (4.2) $$

$$ \alpha\omega = \frac{\sqrt{2}}{4}\sqrt{A_1 - A_3} . \qquad (4.3) $$

Only two of them are independent and the non-trivial solution is

$$ \omega_R = \pi - \arccos\left(\frac{2}{3}\right) = 2.30052398302\ldots , \quad \alpha_R = \frac{\sqrt{30}}{9\left(\pi - \arccos\left(\frac{2}{3}\right)\right)} = 0.26454000216\ldots \qquad (4.4) $$
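The solution (4.4) can be checked numerically against the three conditions (4.1) to (4.3). The sketch below is only a consistency check; the variable names are hypothetical.

```python
import math

omega_R = math.pi - math.acos(2.0 / 3.0)
alpha_R = math.sqrt(30.0) / (9.0 * omega_R)

def A(k, omega):
    return 1.0 - 2.0 * math.cos(k * omega)

a1, a2, a3 = (A(k, omega_R) for k in (1, 2, 3))
lhs = alpha_R * omega_R
rhs = (
    math.sqrt(3.0) / 3.0 * math.sqrt(a1 - a2),   # (4.1)
    math.sqrt(5.0) / 5.0 * math.sqrt(a2 - a3),   # (4.2)
    math.sqrt(2.0) / 4.0 * math.sqrt(a1 - a3),   # (4.3)
)
# Distances between P_i and P_{i+k} for k = 1, 2, 3: the tetrahedron edges.
edges = [math.sqrt((k * lhs) ** 2 + A(k, omega_R) + 1.0) for k in (1, 2, 3)]
print(omega_R, alpha_R)
print(lhs, rhs)
print(edges)
```

All three edge lengths coincide at this point, which is the statement that four consecutive helix points form a regular tetrahedron.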

We take k = 1, 2, 3 and the points obtained from W. D. Smith's algorithm [7]
with a search space reduction, i.e., adopting the 3-sausage's topology as the
only feasible one. The modified algorithm is available at the site
www.biomat.org.




Fig. 2. The section $\alpha = \alpha_R$ of the surface $\rho(\omega, \alpha)$
(line) for k = 1, 2, 3 with points obtained from computing experiments (o).



It should be noted from Fig. 2 that the function (3.17) fits very well the
experiments which were done with W. D. Smith's algorithm. The fit is so good
that it has the same bad performance in some $\omega$-region ($\rho > 1$!), as
can be seen in Fig. 2 above. Although this can be easily explained by the
difficulty of working with a quasi-planar configuration (the neighbourhood of
$\omega = \pi$) when we choose a priori 3-dimensional configurations, we prefer
to circumvent this difficulty in the next section. We do so by introducing some
necessary restrictions related to Full Steiner Trees.

From the viewpoint of the search for a Global minimum, the surface given by
(3.17) is also a very good candidate, as can be seen from its level curves in
Fig. 3 below.

Fig. 3. Level curves of the surface $\rho(\omega, \alpha)$, (3.17). The region
which contains the projection of the Global minimum point is indicated by the
arrow.

If we substitute into (3.17) the values $(\omega, \alpha)$ corresponding to the
3-sausage's configuration, or $\omega = \pi - \arccos(2/3)$,
$\alpha = \sqrt{30}/\left(9(\pi - \arccos(2/3))\right)$, we get

$$ \rho = \frac{1}{10}\left(3\sqrt{3} + \sqrt{7}\right) = 0.78419037337\ldots \qquad (4.5) $$

which is the value given by the authors of [8] for the best upper value for
the Steiner Ratio of the $\mathbb{R}^3$-manifold.
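The value (4.5) can be reproduced by direct substitution. The sketch below evaluates the numerator and the three denominators of (3.17) at the 3-sausage point and compares the result with the closed form; variable names are hypothetical.

```python
import math

omega = math.pi - math.acos(2.0 / 3.0)
alpha = math.sqrt(30.0) / (9.0 * omega)
x = alpha * omega

def A(k):
    return 1.0 - 2.0 * math.cos(k * omega)

numerator = 1.0 + x * math.sqrt(A(1) / (A(1) + 1.0))
denominators = [math.sqrt((k * x) ** 2 + A(k) + 1.0) for k in (1, 2, 3)]
rho = numerator / min(denominators)
closed_form = (3.0 * math.sqrt(3.0) + math.sqrt(7.0)) / 10.0
print(rho, closed_form)
```

At this point the three branches k = 1, 2, 3 of the denominator coincide, so the choice of branch does not affect the result.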

5 The Restriction to Full Steiner Trees 

In the last section there is no restriction to Full Steiner Trees. We take as
an assumption that the natural organization of biomacromolecular structure is
biased by Full Steiner Trees. A structure is either non-degenerate or almost
completely degenerate. Nature can provide a process in which the Steiner Ratio
approaches continuously the value $\rho = 1$. However, the tree cannot be
partially degenerate in the sense of partial full trees connected at degenerate
vertices.

In this section, we emphasize that a good definition of a SRF needs more
stringent constraints rather than a constraint relaxation. "Good" means here a
definition which leads to $\rho$-values less than the value reported in
Section 4, if we believe that the main conjecture of the authors of [8] could
be disproved.

If we look at Fig. 2 above, we can see that neither the algorithm used nor the
modelling based on (3.17) discriminates against degenerate Steiner Trees. This
is due to the fact that there are regions of $\omega$-values in which $\rho$ is
close to 1. To each pair of values $(\omega, \alpha)$ there is associated a
point configuration. This means that there are regions in the line
$(\omega, \alpha_R)$ corresponding to degenerate configurations. This is also
valid for other $\alpha$-values since these sections of the surface (3.17) have
similar profiles.

We can now introduce the restriction to Full Steiner Trees. Let us take the j,
k subsequence of (3.4). The points $P_j, P_{j+k}, P_{j+2k}, \ldots$ are evenly
spaced along the corresponding right circular helix. The angle $\theta_k$ made
by contiguous edges is given by

$$ \cos\theta_k = \frac{\overrightarrow{P_{j+lk}P_{j+(l+1)k}} \cdot \overrightarrow{P_{j+(l+2)k}P_{j+(l+1)k}}}{\left\| \overrightarrow{P_{j+lk}P_{j+(l+1)k}} \right\| \, \left\| \overrightarrow{P_{j+(l+2)k}P_{j+(l+1)k}} \right\|} \qquad (5.1) $$

where $\| \cdot \|$ is the Euclidean norm.

The Cartesian coordinates of the points $P_{j+kl}$ can be written analogously
to (3.1) and we have

$$ \cos\theta_k = -1 + \frac{(A_k + 1)^2}{2\left(k^2\alpha^2\omega^2 + A_k + 1\right)} \qquad (5.2) $$

where $A_k$ is given by (3.12).

The restrictions to Full Steiner Trees can then be written in the form

$$ \cos\theta_k \ge -\frac{1}{2} . \qquad (5.3) $$

In Fig. 4 below we show the restrictions introduced by (5.3) for k = 1, 2, 3.
The horizontal line corresponds to the value $\cos\theta_k = -1/2$.
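The closed form (5.2) and the condition (5.3) can be checked against the vector definition (5.1). The sketch below assumes the unit-radius helix of (3.1); the function names and the test values are hypothetical.

```python
import math

def helix_point(i, omega, alpha):
    return (math.cos(i * omega), math.sin(i * omega), alpha * i * omega)

def cos_theta_formula(k, omega, alpha):
    """cos(theta_k) according to (5.2)."""
    ak = 1.0 - 2.0 * math.cos(k * omega)
    return -1.0 + (ak + 1.0) ** 2 / (2.0 * ((k * alpha * omega) ** 2 + ak + 1.0))

def cos_theta_direct(k, omega, alpha, j=0, l=0):
    """cos(theta_k) from the two vectors of (5.1), both ending at the middle point."""
    a = helix_point(j + l * k, omega, alpha)
    b = helix_point(j + (l + 1) * k, omega, alpha)
    c = helix_point(j + (l + 2) * k, omega, alpha)
    u = tuple(q - p for p, q in zip(a, b))   # vector from P_{j+lk} to P_{j+(l+1)k}
    v = tuple(q - p for p, q in zip(c, b))   # vector from P_{j+(l+2)k} to P_{j+(l+1)k}
    dot = sum(p * q for p, q in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def allows_full_tree(k, omega, alpha):
    """Condition (5.3): the angle between contiguous edges is at most 120 degrees."""
    return cos_theta_formula(k, omega, alpha) >= -0.5

omega_R = math.pi - math.acos(2.0 / 3.0)
alpha_R = math.sqrt(30.0) / (9.0 * omega_R)
for k in (1, 2, 3):
    print(k, cos_theta_formula(k, omega_R, alpha_R), allows_full_tree(k, omega_R, alpha_R))
```

At the 3-sausage point the three values are equal to 1/2, as expected for edges of equilateral triangular faces.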




The spanning trees corresponding to k = 2 have a large forbidden region.
The k = 1 curve is the only one which corresponds to Full Steiner Trees in a
large region of the $\omega$-interval studied in this work. The proposal for
the Steiner Ratio Function is given in this case by

$$ \rho_1(\omega, \alpha) = \frac{l_{SMT}(\omega, \alpha)}{l_{MST}(1, \omega, \alpha)} . \qquad (5.4) $$



From (3.17), we get

$$ \rho_1(\omega, \alpha) = \frac{1 + \alpha\omega\sqrt{\dfrac{A_1}{A_1+1}}}{\sqrt{\alpha^2\omega^2 + A_1 + 1}} . \qquad (5.5) $$



Fig. 5 shows the sections $\alpha = \alpha_R$ of the surfaces given by (3.17)
and (5.5). Restricting our observations to the $\omega$-interval in which the
k = 1 curve in Fig. 4 allows for Full Steiner Trees, it should be noted that
the section $\alpha = \alpha_R$ of the candidate for a SRF given by (5.5) is
the convex envelope of the section $\rho(\omega, \alpha_R)$ of (3.17).

Fig. 5. The $\rho_1(\omega, \alpha_R)$ candidate is a convex envelope of the
function $\rho(\omega, \alpha_R)$.



6 Concluding Remarks 

The restriction of the point configurations to only those which lead to Full
Steiner Trees has wiped out the Global minima of the surface
$\rho(\omega, \alpha)$. The resulting surface $\rho_1(\omega, \alpha)$
satisfies the necessary bounds [3]

$$ \frac{1}{2} \le \rho_1(\omega, \alpha) \le 1 \qquad (6.1) $$

in the $\omega$-interval

$$ \arccos(\ldots) \le \omega \le 2\pi - \arccos(\ldots) . \qquad (6.2) $$
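The upper bound in (6.1) can be checked directly from (5.5). The following is a short sketch, assuming $A_1 > 0$ (that is, $\cos\omega < 1/2$) and writing $x = \alpha\omega$, a shorthand introduced here only for convenience. The difference between the squared denominator and the squared numerator of (5.5) is a perfect square:

```latex
\begin{align*}
\left(x^2 + A_1 + 1\right) - \left(1 + x\sqrt{\frac{A_1}{A_1+1}}\right)^2
  &= \frac{x^2}{A_1+1} - 2x\sqrt{\frac{A_1}{A_1+1}} + A_1 \\
  &= \left(\frac{x}{\sqrt{A_1+1}} - \sqrt{A_1}\right)^2 \;\ge\; 0 .
\end{align*}
```

Hence $\rho_1(\omega, \alpha) \le 1$ under this assumption, with equality when $\alpha\omega = \sqrt{A_1(A_1+1)}$, which by (3.14) is the case $r = 1$ in which the Steiner points fall onto the given points.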

If we wish to restrict the $\rho$-values by using Du-Hwang's greatest lower
bound ($\rho = \sqrt{3}/3$) instead of Moore's ($\rho = 1/2$), we have to
replace (6.2) by

$$ \arccos(\ldots) \le \omega \le 2\pi - \arccos(\ldots) . \qquad (6.3) $$



We need a constraint motivated by the natural organization of a macromolecular
structure in our modelling. Some interesting results have been obtained [1] by
introducing a function of the form

$$ H(\omega, \alpha) = (1 + \lambda)\rho_1(\omega, \alpha) - \lambda\phi(\omega, \alpha) \qquad (6.4) $$

where $\lambda$ is a Lagrange multiplier and $\phi(\omega, \alpha)$ stands for
a recently proposed function for chirality measure given by

0(w,a) = \ojusw(u) ( (a 2 uj 2 - + 1)). (6.5) 



6 v ' \ A x 

The work with (6.4) has produced a new upper bound for $\rho_1(\omega, \alpha)$
which is still lower than the unconstrained minimum of
$\rho_1(\omega, \alpha_R)$ as compared to the Global minimum value of
$\rho(\omega, \alpha)$.

It would be easy to announce a proof for the conjecture, or at least some very
good arguments for a proof, from the work of the last sections. However, in
spite of the evidence reported here, we claim only that the approach adopted
has advanced our program of modelling the structure of biomacromolecules with
Steiner Points and Steiner Trees. The possibility of analyzing the chirality
effects on these structures is one of these advances.